TK987 : Design of a Fast Large-Scale Face Recognition System Based on Deep Neural Networks
Thesis > Central Library of Shahrood University > Electrical Engineering > PhD > 2023
Authors:
Mostafa Diba [Author], Hossein Khosravi [Supervisor]
Abstract: Face recognition is one of the most popular methods of biometric authentication and has applications in many fields, including military, commercial, and everyday public authentication. Face recognition technology makes it possible to authenticate people through their face images. Face recognition has always faced many challenges; today, challenges such as appearance changes have largely been solved with the introduction of deep networks, but challenges such as the large scale of data and the large number of classes remain open. As the number of classes grows beyond 200, the risk of classes becoming intertwined increases, making identification more difficult. The complexity intensifies when only a limited number of samples per class is available, a challenge that has been mitigated through deep learning. Deep architectures are specifically crafted to address problems of high complexity, and the usual approach to harder problems is to increase the depth and width of the models. However, raising the number of convolutional layers while simultaneously increasing the number of filters per layer introduces redundancy. This redundancy slows model training, and training such models demands powerful hardware that is often inaccessible in academic research settings. In this thesis, we enhanced speed in both the training and testing stages while preserving face recognition accuracy by introducing a novel solution. To achieve this, we introduce a method for pruning deep networks that identifies and then removes inactive filters within the model. The proposed method accurately identified these filters and was applied to two face recognition models built on the VGG16 and ResNet50V2 architectures.
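The core idea of pruning by filter activity can be illustrated with a minimal NumPy sketch. The thresholding rule, the calibration-batch approach, and the `inactive_filter_mask` helper below are illustrative assumptions, not the thesis's exact criterion:

```python
import numpy as np

def inactive_filter_mask(activations, threshold=1e-3):
    """Flag convolutional filters whose mean absolute activation over a
    calibration batch falls below a threshold.

    activations: array of shape (batch, height, width, filters).
    Returns a boolean mask; True marks filters to keep.
    """
    mean_response = np.abs(activations).mean(axis=(0, 1, 2))
    return mean_response >= threshold

# Toy calibration batch: 8 images, 4x4 feature maps, 6 filters;
# filters 2 and 5 are (almost) completely inactive.
rng = np.random.default_rng(0)
acts = rng.uniform(0.1, 1.0, size=(8, 4, 4, 6))
acts[..., 2] = 0.0
acts[..., 5] = 1e-6

keep = inactive_filter_mask(acts)
pruned = acts[..., keep]   # drop the dead filters' feature maps
print(keep)                # [ True  True False  True  True False]
print(pruned.shape)        # (8, 4, 4, 4)
```

In a real network, removing a filter also removes the corresponding input channel of the next layer's kernels, which is where the reported parameter and FLOPs savings come from.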
In the face recognition model based on VGG16, accuracy increased by 1.74%, accompanied by a 26.85% reduction in the number of convolutional parameters and a 47.96% decrease in FLOPs (floating-point operations). For the face recognition model based on the ResNet50V2 architecture, we implemented the ArcFace method. Removing inactive filters in this model caused only a slight decrease in accuracy, up to 0.11%, while achieving a substantial 59.38% reduction in convolutional parameters and a 57.29% reduction in FLOPs. In a deep learning model, the greater the number of layers, the more abstract the features the model can extract. This is due to the non-linear activation function in each layer: removing the activation functions from a deep architecture, regardless of its depth, reduces it to a linear classifier. Hence, the activation function is a pivotal element of a deep learning model; it plays a fundamental role in feature extraction and significantly influences the convergence speed of the model. In this thesis, we also introduce a novel face recognition architecture named SNResNet, based on the Inception-ResNet architecture. Inception-ResNet is effective in computer vision applications, yet it has limitations such as computational complexity, high memory consumption, and data dependency. Furthermore, its use of the ReLU activation function and Softmax loss may not be optimal for large-scale face recognition tasks. The proposed SNResNet employs triplet loss for training the model on large datasets.
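The ArcFace method referenced above adds an angular margin to the target-class logit before softmax. A minimal NumPy sketch of the margin computation follows; the function name and toy data are illustrative, and the scale `s=64.0` and margin `m=0.5` are common defaults from the ArcFace literature, not values stated in this abstract:

```python
import numpy as np

def arcface_logits(embeddings, weights, labels, s=64.0, m=0.5):
    """Compute ArcFace logits: s*cos(theta + m) for the true class,
    s*cos(theta) for all others.

    embeddings: (N, d) face embeddings; weights: (C, d) class centers;
    labels: (N,) integer class ids.
    """
    e = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    w = weights / np.linalg.norm(weights, axis=1, keepdims=True)
    cos = np.clip(e @ w.T, -1.0, 1.0)        # cos(theta), shape (N, C)
    theta = np.arccos(cos)
    idx = np.arange(len(labels))
    logits = s * cos
    # Add the angular margin only on the ground-truth class.
    logits[idx, labels] = s * np.cos(theta[idx, labels] + m)
    return logits

# Toy check: embeddings identical to their class centers (theta = 0),
# so the margin lowers the target logit from s to s*cos(m).
rng = np.random.default_rng(1)
W = rng.normal(size=(3, 8))
y = np.array([0, 1, 2])
print(arcface_logits(W[y], W, y)[np.arange(3), y])  # ~ 64*cos(0.5)
```

The margin forces embeddings of the same identity to cluster more tightly on the hypersphere, which is what makes the method effective at large class counts.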
The ReLU activation function discards all negative values, which in certain applications can diminish model accuracy. To address this, we introduced a new activation function called Rish, which demonstrates improved performance. Additionally, we optimized the Inception-ResNet-B and Inception-ResNet-A blocks by incorporating the SqNxt block to manage the computational cost of the model. The proposed model achieved an accuracy of 99.68% on the benchmark LFW database, surpassing the standard model's accuracy of 98.85%, while exhibiting a 15.56% reduction in convolutional parameters and a 15.61% decrease in FLOPs compared to the standard model. These results indicate a lower computational cost and a faster face recognition model.
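The point about ReLU discarding negative values can be seen numerically. Since the abstract does not give Rish's formula, the sketch below uses Swish (x·sigmoid(x)) purely as a stand-in for a smooth activation that, unlike ReLU, preserves a scaled portion of negative inputs:

```python
import numpy as np

def relu(x):
    # Zeroes out every negative input: that information is lost.
    return np.maximum(0.0, x)

def swish(x):
    # Illustrative smooth alternative (NOT the thesis's Rish function):
    # negative inputs keep a small non-zero response.
    return x / (1.0 + np.exp(-x))

x = np.array([-2.0, -0.5, 0.0, 1.0, 3.0])
print(relu(x))   # [0.  0.  0.  1.  3.]
print(swish(x))  # negative entries are small but non-zero
```

Activations of this family stay differentiable around zero and let gradient information flow through negative pre-activations, which is the behavior the thesis attributes to Rish.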
Keywords:
#Face Recognition #Network Pruning #Deep Learning #Inception-ResNet #VGG16 #SqNxt block #ResNet50V2
Keeping place: Central Library of Shahrood University